Dual level pattern recognition system

ABSTRACT

A pattern recognition system has both coarse and fine levels of analysis in which a coarse array representation of a workpiece pattern is formed and used to identify the workpiece pattern as either a reference character or as a member of an ambiguous set of reference characters which are represented by identical coarse array representations. At least portions of a fine array representation of a workpiece pattern which has been classified as a member of a set of reference characters are compared to corresponding portions of fine array representations of reference characters in the set identified by the classification. The pattern recognition system has a learning system through which reference characters may be introduced into the recognition system and through which a representation of a workpiece pattern which was not identified may be incorporated into an existing set of reference characters of the recognition system.

BACKGROUND OF THE INVENTION

This invention relates to identifying patterns as corresponding to one of a set of reference characters.

A number of pattern recognition systems have been devised in which both coarse and fine analyses have been made of a workpiece pattern. One patent, Yamamoto et al. (U.S. Pat. No. 3,829,831), discloses a system in which a workpiece pattern is converted into a signal derived from a coarse array representation of the workpiece pattern, and classified by comparing the signal to a set of pattern-classifying signals. The workpiece pattern is then identified by comparing a signal derived from a fine array representation of the workpiece pattern to a set of reference signals derived from fine array representations of reference characters selected in the classification analysis.

SUMMARY OF THE INVENTION

In general, the invention features a pattern recognition system having both coarse and fine levels of analysis in which a coarse array representation of a workpiece pattern is formed and used to identify the workpiece pattern as either a reference character or as a member of an ambiguous set of reference characters which are represented by identical coarse array representations. At least portions of a fine array representation of a workpiece pattern which has been classified as a member of a set of reference characters are compared to corresponding portions of fine array representations of reference characters in the set identified by the classification. A workpiece pattern which is not identified as a reference character during a first coarse and fine array analysis is re-encoded into coarse and fine array representations and analyzed again. The pattern recognition system has a learning system through which reference characters may be introduced into the recognition system and through which a representation of a workpiece pattern which was not identified may be incorporated into an existing set of reference characters of the recognition system.

The coarse array representation of a reference character may, according to the invention, operate as an address to an entry in a first database. Each first database entry contains a multiplicity field giving the number of reference characters in the set represented by the coarse array representation. Each first database entry contains a second field giving a code for a reference character having an unambiguous coarse array value or an address to an entry in a second database for a pattern having an ambiguous coarse array value. There is a second database entry for every reference character which is not unambiguously defined by a coarse array representation. Each second database entry contains an identifying code for its reference character and at least portions of the fine array representation of the reference character for use in resolving ambiguities between the reference character and the other reference characters represented by the same coarse array. Second database entries for each reference character represented by identical coarse arrays are grouped together sequentially within the second database.

Also according to the invention, the pattern learning system may accept a workpiece pattern as a new reference character by comparing the coarse array representation of the workpiece pattern to the coarse array representations of an existing set of reference characters. For new reference characters for which no identical coarse array representation is found, an identifying character code is stored in a first database with the new coarse array representation of the reference character as its address. For a new reference character for which an identical coarse array representation is found, second database entries for a new set of reference characters represented by identical coarse arrays are generated, and a first database entry containing the number of reference characters in the new set and the address of the first of the second database entries for the set is established.

The pattern learning system may use a measure of distinctiveness for isolating the most distinctive portions of fine array representations of reference characters represented by identical coarse arrays; the most distinctive portions are used by the pattern recognition system to resolve ambiguities between reference characters represented by identical coarse arrays.

Advantages of the pattern recognition system of the present invention over prior art systems include shorter processing time for each character recognition analysis resulting from the fewer character comparisons necessary to unambiguously identify a workpiece character. Specifically, for reference characters having unambiguous coarse array representations, only coarse analysis need be performed. Further, since reference characters represented by identical coarse arrays are organized into sets of reference characters, called ambiguous sets, and fine analysis is confined to the members of the one ambiguous set which shares the same coarse array representation as the workpiece character, the number of members of the ambiguous set is the maximum number of characters examined during fine analysis.

On an even more fundamental level, since the coarse analysis is conducted using a coarse array scheme, the number of picture elements which are analyzed to identify a workpiece character or to classify it into a set is limited to the number of picture elements in the coarse array.

Further, in fine analysis, only the most distinctive portions of a fine array representation of a reference character are examined; therefore, the number of picture elements which identify a workpiece character not having a distinctive coarse array representation is limited to the number of picture elements in the most distinctive portions of the fine array representations of the reference character.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a block diagram of an apparatus for operating the pattern recognition system of the present invention.

FIG. 2 shows a block diagram of the D2 Character Recognizer shown in FIG. 1.

FIG. 3 shows a block diagram of an apparatus for operating the pattern learning system of the present invention.

FIG. 4 shows diagrammatically the coarse and fine arrays into which each character is organized.

FIG. 5 shows a flow charted representation of the program which is used to compress a fine character array into a coarse character array.

FIG. 6 shows a flow charted representation of the pattern learning system for generating sets of reference characters represented by identical coarse arrays and entries in a first database for reference characters having unambiguous coarse array representations.

FIG. 7 shows a flow charted representation of the pattern learning system for generating first database and second database entries for reference characters represented by identical coarse arrays.

FIG. 8 shows a more detailed flow chart of second database entry generation, as shown in FIG. 7.

FIG. 9 shows a more detailed flow chart of distinctive cell selection, as shown in FIG. 8.

FIG. 10 shows diagrammatically one of the twelve learning matrices constructed by the pattern learning system in generating D2RAM entries.

FIG. 11 shows diagrammatically the maximum discrimination matrix which is constructed from the twelve learning matrices.

FIG. 12 shows an example of digitizing and compressing a typical character.

FIG. 13 shows an example of generating second database entries for characters represented by identical coarse arrays.

FIG. 14 shows the structure by which the entries of the first and second databases are organized.

DETAILED DESCRIPTION

There is shown in FIG. 1 a block diagram of an apparatus for operating the character recognition system 10 of the present invention. The system 10 has a digitizer 12 with which a pattern is quantized into a 6×8 array of binary elements (the fine array), a compressor 14 in which the 48 element fine array is compressed into a 3×4 array of binary elements (the coarse array), and a character name inputter 20 with multiplexer 22 for accepting the name of the character once a workpiece character is identified.

Pattern recognizing system 10 has a first database D1RAM, each entry thereof being addressed by a 12-bit value of a coarse array. Each entry in D1RAM has a multiplicity field of three bits which contains the number of characters in the reference set which have the coarse array value of the entry address. Each entry in D1RAM has a second field with contents which depend on the value in the multiplicity field of the entry. When the value in the multiplicity field is 1 (meaning there is only one reference character with a coarse array value equal to the entry address), the second field contains the identifying code of the single reference character. When the value in the multiplicity field is greater than 1 (meaning there are a plurality of reference characters with coarse array values equal to the entry address), the second field contains an address to D2RAM.
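By way of illustration only, the D1RAM entry layout described above may be sketched in Python. The sketch assumes that bit 0 of the 16-bit word is the most significant bit, which the description does not state; the function names pack_d1, unpack_d1 and character_code are hypothetical.

```python
# Sketch of a 16-bit D1RAM word.
# ASSUMPTION: bit 0 is the most significant bit of the word.

def pack_d1(multiplicity, payload):
    """Build a D1 word: bits 0-2 hold N, bits 3-15 hold a code or a D2RAM address."""
    if not 0 <= multiplicity <= 7:
        raise ValueError("multiplicity must fit in three bits")
    if not 0 <= payload < (1 << 13):
        raise ValueError("payload must fit in thirteen bits")
    return (multiplicity << 13) | payload


def unpack_d1(word):
    """Return (N, payload), where payload is bits 3-15 of the word."""
    return (word >> 13) & 0b111, word & 0x1FFF


def character_code(word):
    """Return bits 8-15 of the word, the identifying code when N equals 1."""
    return word & 0xFF
```

A three-bit multiplicity field limits an ambiguous set to seven members under this layout.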

D2RAM contains an entry for each reference character which has a coarse array value equal to that of any other character in the reference set (that is, all characters that do not have unshared coarse array values). All characters with a common coarse array value are called an ambiguous set, and the entries therefor are contiguous in D2RAM. The first address for the entries of an ambiguous set is that given for the set in D1RAM.

Each D2 entry is organized into seven fields over two words of D2RAM memory. In the first word of a D2 entry, the first and third fields contain information identifying the two subsets of the 48 element fine array of that reference character which are most discriminating with respect to the other members of its ambiguous set; the second and fourth fields contain the values of the identified subsets. The fifth and sixth fields in the second word of the D2 entry are reserved; the seventh field contains the identifying code for the reference character.
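By way of illustration only, a two-word D2 entry may be sketched in Python. The sketch assumes that each of the first four fields is four bits wide and that the fields are packed from the most significant bits downward; neither is stated explicitly in the description. The function names are hypothetical.

```python
# Sketch of a two-word D2RAM entry.
# ASSUMPTIONS: fields I-IV are four bits each, packed from the most
# significant bits of the first word; the code sits in the low byte
# of the second word; reserved fields five and six are left as zero.

def pack_d2(cell_a, bits_a, cell_b, bits_b, code):
    """Return (word1, word2) for one reference character."""
    for field in (cell_a, bits_a, cell_b, bits_b):
        if not 0 <= field <= 15:
            raise ValueError("each field must fit in four bits")
    word1 = (cell_a << 12) | (bits_a << 8) | (cell_b << 4) | bits_b
    word2 = code & 0xFF
    return word1, word2


def unpack_d2_word1(word1):
    """Return (field I, field II, field III, field IV) of the first word."""
    return ((word1 >> 12) & 0xF, (word1 >> 8) & 0xF,
            (word1 >> 4) & 0xF, word1 & 0xF)
```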

Returning to FIG. 1, the apparatus for use in coarse analysis has D1RAM, a first database address register D1AR arranged to be loaded with the coarse array representation of a workpiece character, a first database buffer register D1BR arranged to accept a 16 bit word from D1RAM, and a first database D1 character recognizer 16 arranged to be loaded with the first field of the D1 entry.

The apparatus for use in fine analysis has D2RAM, a second database address register D2AR arranged to be loaded with the second field of a D1 entry, a second database buffer register D2BR arranged to accept a 16 bit word from D2RAM, and a second database character recognizer 36 arranged to be loaded with the fine array representation of a workpiece pattern and the first word of a D2 entry.

As seen in more detail in FIG. 2, D2 character recognizer 36 has 12 four bit cell registers 40 in which the 48 element fine array representation of the workpiece pattern is stored temporarily during both coarse and fine analysis, a 16 bit register 44 into which the first word of a D2RAM entry is loaded from D2BR, decoders 42, 52 arranged to select two of the cell registers 40 from address location information stored in sections I and III of register 44, a corresponding bit comparer 46 arranged to do an exclusive-NOR comparison on the contents of the two registers 40 selected by decoders 42, 52 and the element values stored in sections II and IV in register 44, and a disambiguation impossibility recognizer 90 arranged to be enabled by decoders 42, 52 to indicate that the workpiece pattern cannot be identified unambiguously but may be one of two or more members of an identified ambiguous set.
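By way of illustration only, the loading of the twelve four bit cell registers from a 48 element fine array may be sketched in Python. The sketch assumes a row-major 6×8 fine array and an arbitrary bit order within each cell (upper-left element most significant); the description fixes neither, and the function name is hypothetical.

```python
# Sketch: split a 48-element fine array into twelve 4-bit cell values.
# ASSUMPTIONS: the fine array is row-major with 6 elements per row, and
# within a cell the bit order is upper-left, upper-right, lower-left,
# lower-right from most to least significant.

def fine_to_cells(dg):
    """Return 12 four-bit integers, one per 2x2 block of the fine array."""
    if len(dg) != 48:
        raise ValueError("fine array must have 48 elements")
    cells = []
    for row in range(4):          # coarse row
        for col in range(3):      # coarse column
            a = 12 * row + 2 * col  # upper-left element of the block
            cells.append((dg[a] << 3) | (dg[a + 1] << 2)
                         | (dg[a + 6] << 1) | dg[a + 7])
    return cells
```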

In FIG. 3, a block diagram is shown of an apparatus for operating the character learning system 50 of the present invention. Learning system 50 shares, with recognition system 10, controls 30, digitizer 12, compressor 14, D1RAM, D1AR, D1BR, D2RAM, D2AR and D2BR. Learning system 50 has an input counter 54 arranged to be loaded with the number of reference characters to be encoded and inputted into pattern recognition system 10, a character name selector 56 for inputting an identifying binary code CH(0-7) for each inputted reference character, and (normally offline) learning list storage 60 for storing the coarse and fine array representations and CH(0-7) for every reference character. List storage 60 has a set counter 62 arranged to be loaded with the number of reference characters in storage 60.

Character learning system 50 has a character learner 70 arranged to accept the first field of a D1BR entry into ambiguity counter 64, the second field into code register 76, and CH(0-7) and the contents of D2AR into D1 entry generator 58, in order to generate a D1 entry in D1 entry generator 58 based on the value of ambiguity counter 64. Character learner 70 also has D1 entry counter 74 arranged to be loaded with the number of entries in D1RAM.

For generating D2 entries, character learner 70 is arranged to accept CH(0-7) and the contents of code register 76 into ambiguous set storage 66 in order to isolate sets of reference characters represented by identical coarse arrays. Character learner 70 has D2 entry generator 68 arranged to accept the fine array representations of every member of an ambiguous set as stored in ambiguous set storage 66 in order to determine the most discriminating portions of each reference character with respect to the other members of its ambiguous set and, accordingly, generate entries for D2RAM.

OPERATION

CHARACTER RECOGNITION SYSTEM 10

In operation, the character recognition system 10 uses digitizer 12 to encode a workpiece character into a 48 element fine array of binary numbers representative of the positional distribution of optical density within the pattern.

The character is baseline and left-most justified with respect to a template with a 6×8 array of picture fields, where the descending portion of a character occupies the lower two rows of the array, and an ascending portion occupies the first two rows. A value 1 is assigned to an element of the fine array if any part of a character appears in the corresponding field of the template and otherwise a 0 is assigned. The fine array, which is temporarily stored in cell registers 40 within D2 character recognizer 36, with values derived from one block of four picture fields in each register, is compressed into a 3×4 coarse array by operation of compressor 14, which converts the 48-bit output value DG(0-47) of digitizer 12 into the compressed 12-bit output value CP(0-11) of the compressor 14. The conversion is equivalent to redefining the size of the fields of the template grid. As seen in FIG. 4, each element of CP(0-11) represents the same area as a block of four picture elements of DG(0-47). Recognition system 10 employs a data compression program 15 which essentially assigns a value of "1" to a compressed picture element, hereinafter called a cell, if any of the corresponding four picture elements in the fine array equals "1". A data compression program written in Algol (and flowcharted in FIG. 5) follows. In both the program and the flowchart, "a" is a picture element of the fine array, "c" is a cell of the coarse array, and "d" is a row counter for the coarse array.

    ______________________________________
    Program: Data Compression
    ______________________________________
    a = 0
    DO d = 0, 1, 2, 3
      DO a = a, a+2, a+4
        c = (a-6*d)/2
        IF DG(a) OR DG(a+1) OR DG(a+6) OR DG(a+7) = 1
          THEN CP(c) = 1; ELSE CP(c) = 0
      a = a+8
    ______________________________________
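By way of illustration only, the compression may also be sketched in Python. The sketch assumes that DG(0-47) is stored row-major with six elements per row, which is consistent with the offsets a+6 and a+7 in the listing; the function name compress is hypothetical.

```python
# Sketch of the 48-to-12 element compression.
# ASSUMPTION: DG is row-major, 6 elements per row, 8 rows.

def compress(dg):
    """Return CP(0-11): each cell is 1 if any of its four fine elements is 1."""
    if len(dg) != 48:
        raise ValueError("fine array must have 48 elements")
    cp = [0] * 12
    for d in range(4):              # row counter for the coarse array
        for col in range(3):        # column of the coarse array
            a = 12 * d + 2 * col    # upper-left fine element of the cell
            c = (a - 6 * d) // 2    # cell index, as in the listing
            cp[c] = 1 if (dg[a] or dg[a + 1] or dg[a + 6] or dg[a + 7]) else 0
    return cp
```

For coarse row d the listing's variable a takes the values 12d, 12d+2 and 12d+4, so c runs over 3d, 3d+1 and 3d+2.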

Examples of digitized compression are shown in FIG. 12, where the letter "T" of a Courier 12 font is shown superimposed on a 6×8 element digitizing grid and in fine and coarse array representations.

For the coarse analysis, controls 30 load D1AR with CP(0-11), the coarse array representation of the workpiece pattern. CP(0-11) operates as an address to a D1RAM entry, which is then read into D1BR. Controls 30 cause D1 character recognizer 16 to be loaded with the first three bits of D1BR, hereinafter called N, which contain the multiplicity of reference characters represented by the coarse array CP(0-11).

If the first three bits D1BR(0-2) equal zero, then no reference character in the current recognition system is represented by the coarse array of the workpiece pattern. D1 character recognizer 16 generates a signal X0, upon receipt of which controls 30 order a second encoding, compression, and analysis of the workpiece character. If recognition system 10 is unable to identify the workpiece pattern after a second analysis and the signal X0 is generated twice in a row, controls 30 cause character name inputter 20 to accept an "unidentified character" code. The operator of recognition system 10 may elect to incorporate the workpiece pattern into the set of reference characters by triggering the pattern learning system 50, described below.

If D1BR(0-2) equals one, the coarse array defined by CP(0-11) represents only one reference character. D1 character recognizer 16 generates a signal X1 which causes character name inputter 20 to be loaded with D1BR(8-15), the identifying binary character code for the reference character represented by the coarse array CP(0-11). Controls 30 then trigger digitizer 12 to digitize the next workpiece character for input into recognition system 10.

If D1BR(0-2) is greater than one, the coarse array CP(0-11) represents a distinct ambiguous set of reference characters. D1 character recognizer 16 generates the signal X2 which triggers a fine analysis to resolve the ambiguity. Controls 30 cause N, the number of the reference characters in the ambiguous set so identified, to be transferred from D1 character recognizer 16 to ambiguity down counter 48 within recognizer 36. Controls 30 also cause D2AR to be loaded with D1BR(3-15), the address of the first D2RAM entry of the ambiguous set so identified. The first word of the D2RAM entry addressed by D2AR is read first into D2BR and then into register 44 of the D2 character recognizer 36.
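By way of illustration only, the three outcomes of the coarse analysis may be sketched in Python. The sketch models D1RAM as a dictionary keyed by the 12-bit coarse array value and assumes that CP(0) and bit 0 of D1BR are the most significant bits; both are assumptions, and the function name coarse_analysis is hypothetical.

```python
# Sketch of the coarse analysis dispatch on the multiplicity field N.
# ASSUMPTIONS: D1RAM is modeled as a dict (a missing key reads as a
# cleared word); CP(0) and D1BR bit 0 are most significant.

def coarse_analysis(d1ram, cp):
    """Return (signal, result) for a 12-element coarse array cp."""
    address = 0
    for bit in cp:                    # CP(0-11) acts as the D1RAM address
        address = (address << 1) | bit
    word = d1ram.get(address, 0)      # contents of D1BR
    n = (word >> 13) & 0b111          # D1BR(0-2), the multiplicity N
    if n == 0:
        return ("X0", None)           # no reference character: re-encode
    if n == 1:
        return ("X1", word & 0xFF)    # D1BR(8-15), the character code
    return ("X2", (n, word & 0x1FFF)) # N and D1BR(3-15), the D2RAM address
```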

As described above, the first word of a D2RAM entry for a reference character contains the locations and element bit values of the most distinctive portions of the reference character with respect to the other members in its ambiguous set. In this embodiment, the portions analyzed for distinctiveness by character learner 70 correspond to cells of a coarse array, one of which is shown in FIG. 4. Specifically, each of the first and third fields in a D2 entry contains a binary number from 0 to 11, identifying which two of the twelve cells of a reference character are sufficiently distinctive to resolve ambiguities between the reference character and other reference characters in its ambiguous set. The second and fourth fields contain the element bit values of those distinctive cells. The first and third fields may also contain the number 15, indicating a reference character of an ambiguous set which cannot be unambiguously identified.

As seen in FIG. 2, controls 30 cause the first and third fields of a D2 entry to be loaded into sections I and III of register 44, and the second and fourth fields to be loaded into sections II and IV of register 44. Decoders 42, 52, attached to the outputs of sections I and III, decode the first and third fields of the D2 entry and, if the binary numbers are between 0 and 11, enable the reading of two of the twelve temporary registers 40; if decoders 42, 52 decode the number 15, they enable the disambiguation impossibility recognizer 90.

We return to character recognition by D2 character recognizer 36. As described above, cell registers 40 store the 48 element fine array representation of the workpiece pattern, one cell to each four bit register 40. Decoders 42, 52 enable the reading of the two cell registers 40 which correspond to the distinctive cells of the fine array. In corresponding bit comparer 46, the element bit values of Section II and the cell register 40 enabled by decoded Section I and the element bit values of Section IV and the cell register 40 enabled by decoded Section III are compared, bit by bit, for equivalence by exclusive-NOR circuitry.
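By way of illustration only, the bit-by-bit equivalence test on one four bit cell may be sketched in Python; the function name cells_match is hypothetical.

```python
# Sketch of an exclusive-NOR comparison of two 4-bit cell values.

def cells_match(cell_bits, reference_bits):
    """Return True when all four corresponding bits are equal."""
    xnor = ~(cell_bits ^ reference_bits) & 0xF  # 1 where the bits agree
    return xnor == 0xF
```

Both selected cells must satisfy this test for the workpiece pattern to be identified as the reference character.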

If both corresponding cells are identical, then the workpiece pattern is identified as the reference character for which the present D2RAM entry has been generated. Corresponding bit comparer 46 generates the signal Y2 which causes controls 30 to increment D2AR so that the second word of the D2RAM entry, which contains the identifying binary character code for the reference character, is read into D2BR. Y2 also enables character name inputter 20 and multiplexer 22 to accept the identifying character code D2BR(8-15) from D2BR, and triggers digitizer 12 to digitize the next workpiece character for identification.

If corresponding bit comparer 46 does not find identical corresponding cells, it generates the signal Y1 which causes controls 30 to start a fine analysis of the next reference character in the ambiguous set. Controls 30 decrement the ambiguity down counter 48; if it does not equal zero, the ambiguous set contains more reference characters against which the workpiece pattern may be compared, so controls 30 increment D2AR twice so as to access the first word of the D2RAM entry of the next reference character in the ambiguous set. Fine analysis of reference characters continues until the workpiece pattern is identified or until all of the members of the ambiguous set have been analyzed.

When ambiguity down counter 48 is zero and the disambiguation impossibility recognizer 90 has been activated during analysis of the ambiguous set, D2 character recognizer 36 generates the signal Y0. Upon receipt of Y0, controls 30 cause D2AR to be incremented and the character code in D2 entry field seven to be inputted into character name inputter 20 along with an ambiguity indicator. In this manner, a similar if not unambiguously identified reference character is included in the system output for use in visual inspection by the operator.

When ambiguity down counter 48 equals zero, and the disambiguation impossibility recognizer 90 has not been activated during analysis of the ambiguous set, controls 30 order a second encoding, compressing, and analysis of the workpiece character. If character recognition system 10 is unable to identify the workpiece pattern after the second try, controls 30 cause D2AR to be incremented and the character code in D2 entry field seven to be loaded into character name inputter 20 along with an ambiguity indicator. The operator of recognition system 10 may choose to incorporate the workpiece character into the reference set by triggering character learning system 50.
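By way of illustration only, the fine analysis of one ambiguous set may be sketched in Python. The sketch assumes four bit fields packed from the most significant bits of the first D2 word, models D2RAM as a list of words, and takes the code reported on failure to be that of the last entry examined, which is one reading of the description. The function name and the "RETRY" label are hypothetical.

```python
# Sketch of the fine analysis loop over an ambiguous set.
# ASSUMPTIONS: D2RAM is a list of 16-bit words, two per entry; fields
# I-IV are four bits each from the most significant bits; fields I and
# III hold 0-11 or 15; "RETRY" is a hypothetical label for the case in
# which controls order a second encoding.

def fine_analysis(d2ram, start, n, cells):
    """Compare 12 workpiece cell values against n consecutive D2 entries."""
    impossible = False
    addr = start                               # contents of D2AR
    for remaining in range(n, 0, -1):          # ambiguity down counter
        word1 = d2ram[addr]
        f1, v1 = (word1 >> 12) & 0xF, (word1 >> 8) & 0xF
        f3, v3 = (word1 >> 4) & 0xF, word1 & 0xF
        if f1 == 15 or f3 == 15:
            impossible = True                  # recognizer 90 enabled
        elif cells[f1] == v1 and cells[f3] == v3:
            return ("Y2", d2ram[addr + 1] & 0xFF)
        if remaining > 1:
            addr += 2                          # first word of next entry
    code = d2ram[addr + 1] & 0xFF              # field seven of last entry
    return ("Y0", code) if impossible else ("RETRY", code)
```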

CHARACTER LEARNING SYSTEM 50

Whether the character learning system 50 is triggered to add one or more new characters to an existing reference character set or to incorporate a completely new reference set into D1RAM and D2RAM, the procedures which system 50 follows are identical. The only difference between the two is that the operator of system 50 clears offline learning list storage 60, which normally holds all data on the reference set, and set counter 62, which normally holds the number of patterns in the reference set, if an entirely new reference set is to be inputted.

Flowcharted procedures for character learning system 50 are shown in FIGS. 6-9. As seen in FIG. 6, which shows the procedure for generating D1 entries for reference characters having unambiguous coarse arrays, the operator of character learning system 50 loads into input counter 54 the number of characters to be inputted. The number in counter 54 thus is either the number of characters to be incorporated into an existing reference set or the number of members of a new reference set. The operator of learning system 50 then places the first character leftmost and baseline justified onto the 6×8 element digitizing grid of digitizer 12. Control is transferred to compressor 14, which compresses the fine array DG(0-47) into the 12 element coarse array CP(0-11) with the same compression program described above and shown in FIG. 5. System 50 pauses until the operator inputs, through character name selector 56, the binary coded character name CH(0-7) of the new reference character.

Controls 30 then cause the new reference character to be added to the reference set by loading CH(0-7), CP(0-11) and DG(0-47) into offline learning list storage 60 and incrementing set counter 62. Controls 30 then decrement input counter 54. If input counter 54 has not reached zero, there are more characters to add to list storage 60, and control is again transferred to digitizer 12.

If input counter 54 is zero, then all new characters have been added to list storage 60; controls 30 cause character learner 70 of learning system 50 to generate D1RAM and D2RAM entries for every reference character stored in learning list storage 60. Controls 30 generate signals to clear D1RAM, D2RAM, code register 76, and D1 entry counter 74. Controls 30 also load input counter 54 with the contents of set counter 62 so that the number of reference characters that are being entered into character recognition system 10 may be monitored through input counter 54.

Controls 30 then cause the character code CH(0-7) for the first character in storage 60 to be read and loaded into D1 entry generator 58 within character learner 70. The coarse array CP(0-11) of the first character is loaded into D1AR, where it operates as an address to a D1RAM entry, which is read into D1BR. Controls 30 cause ambiguity counter 64 within character learner 70 to be loaded with the first three bits of D1BR, so that ambiguity counter 64 contains the number, called hereinafter "N", of reference characters represented by the coarse array CP(0-11). Controls 30 also cause the remainder of D1BR, D1BR(3-15), to be loaded into code register 76 within character learner 70.

When the character code, hereinafter called "P", in code register 76 is equal to zero, no previously inputted reference character has been represented by the coarse array CP(0-11) currently stored in D1AR; D1 entry generator 58 generates a D1 entry for the reference character being inputted. D1 entry counter 74 is incremented to include the new D1 entry, while ambiguity counter 64 is incremented and N written into D1BR[0-2]; the character code CH(0-7) which had been stored in D1 entry generator 58 is written into D1BR[8-15].

When P does not equal zero (indicating that a D1 entry had been generated for an earlier-introduced reference character also represented by coarse array CP(0-11)), character learner 70 operates either to set up an ambiguous set for the reference characters represented by P and CH(0-7) or to add CH(0-7) to an already established ambiguous set of which the reference character represented by P is a part.

To determine whether an ambiguous set has already been established, N, the number of reference characters represented by the coarse array CP(0-11), is examined.

If N is greater than 1, then an ambiguous set having P as a member has already been established in ambiguous set storage 66; controls 30 cause CH(0-7) to be added in storage 66 to that ambiguous set. If N equals 1, no such ambiguous set has been established; controls 30 cause an ambiguous set to be formed, with P as its first member and CH(0-7) as its second member. In both cases, ambiguity counter 64 is incremented to include the new reference character, and N is loaded into D1BR[0-2].

At this point, whether the coarse array CP(0-11) was found to represent an unambiguous pattern or an ambiguous set, the new D1 entry is written into D1RAM at the location addressed by CP(0-11). Input counter 54 is then decremented; if it is not equal to zero, list storage 60 contains more reference characters to be entered into D1RAM. Controls 30 cause the next character from storage 60 to be read into D1AR and character learner 70. If input counter 54 is zero, the entire reference set has been read from storage 60; controls 30 trigger the portion of the learning system 50, shown in FIG. 7, which generates D2 entries and final D1 entries for ambiguous reference characters.
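By way of illustration only, this first learning pass may be sketched in Python. The sketch models D1RAM and ambiguous set storage 66 as dictionaries and detects an unused coarse array by dictionary membership rather than by testing P for zero; these are simplifying assumptions, and the function name build_d1 is hypothetical.

```python
# Sketch of the first learning pass over the reference list.
# ASSUMPTIONS: D1RAM and ambiguous set storage are modeled as dicts;
# an address absent from d1 corresponds to a cleared D1 entry.

def build_d1(reference_list):
    """reference_list: iterable of (character code, coarse array address)."""
    d1 = {}          # address -> [N, code P of the first character seen]
    ambiguous = {}   # address -> codes of the members of the ambiguous set
    for code, address in reference_list:
        if address not in d1:
            d1[address] = [1, code]             # unambiguous so far
        else:
            entry = d1[address]
            if entry[0] == 1:
                ambiguous[address] = [entry[1], code]  # form a new set
            else:
                ambiguous[address].append(code)        # extend the set
            entry[0] += 1                       # ambiguity counter
    return d1, ambiguous
```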

As seen in FIG. 7, controls 30 cause the value of 1 to be loaded into D1AR and D2AR, so that both the first and second database address registers are set to reference their first available memory spaces. Controls 30 cause the D1RAM location so addressed to be read into D1BR. Ambiguity counter 64 is then loaded with D1BR[0-2], and its contents N are examined. If N equals 0, no reference character in storage 60 is represented by the coarse array in D1AR; neither D1 nor D2 entries are generated. Character learner 70 accesses the next D1 entry.

Specifically, if D1 entry counter 74, which holds the number of D1 entries which have not been analyzed, equals 0, there are no more D1 entries from which D2 entries may be generated. The D2 entries for every ambiguous set have been generated, so character learning system 50 ends.

If D1 entry counter 74 is non-zero, at least one D1 entry has not yet been analyzed; controls 30 cause D1AR to be incremented, and the next addressable D1 location is read into D1BR. If the first field of the D1 entry, i.e. N, is equal to one, the coarse array currently in D1AR is representative of the single reference character identified by D1BR[8-15]; a D2 entry is not generated. Since the D1 entry is in final form, controls 30 cause the next addressable location in D1RAM to be analyzed.

If, instead, N is greater than one, the coarse array CP(0-11) currently in D1AR represents more than one reference character. Character learner 70 proceeds to generate sequential D2 entries for each of the reference characters represented by CP(0-11) and a D1 entry addressing the first of those D2 entries. Character learner 70 loads code register 76 with D1BR(8-15) and loads D1BR(3-15) with D2AR, the address of the first available location in D2RAM. Controls 30 write D1BR into the D1RAM entry addressed by D1AR, and then generate the D2 entries for every member of the ambiguous set of which P is a member.
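By way of illustration only, the allocation of D2RAM start addresses during this second pass may be sketched in Python. The sketch assumes that each D2 entry occupies two consecutive words, that D1 addresses are visited in ascending order, and that the first available D2RAM location is 1; the function name is hypothetical.

```python
# Sketch of assigning the D2RAM start address of each ambiguous set.
# ASSUMPTIONS: two words per D2 entry; D1 addresses scanned in ascending
# order; first free D2RAM location is 1.

def assign_d2_addresses(multiplicities, first_free=1):
    """multiplicities: dict mapping D1 address -> N. Returns D1 address -> D2 start."""
    next_free = first_free               # contents of D2AR
    starts = {}
    for address in sorted(multiplicities):
        n = multiplicities[address]
        if n > 1:                        # only ambiguous sets need D2 entries
            starts[address] = next_free  # written into D1BR(3-15)
            next_free += 2 * n           # n entries of two words each
    return starts
```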

As described above, there is a D2 entry for every reference character which is not represented by an unambiguous coarse array. Each D2 entry consists of seven fields over two D2RAM words, with the first four fields containing information as to the location and element values of the two most distinctive portions of the reference character in issue with respect to the other members of its ambiguous set. The second word of the D2 entry has reserved fifth and sixth fields and a seventh field which contains the identifying binary character code for the reference character in issue.

As seen in FIG. 8, which is a more detailed flow chart of the D2 generation portion of FIG. 7, controls 30 cause character learner 70 to read ambiguous set storage 66 for a list of the binary codes CH(0-7) of all of the members of the ambiguous set of which P, as stored in code register 76, is a member. Controls 30 then cause character learner 70 to read, from offline learning list storage 60, the fine array representations DG[0-47] for every member of the isolated ambiguous set.

Character learner 70 then starts the procedure by which the most distinctive cells of each member of the ambiguous set are isolated. It examines each character individually and generates a measure of relative cell distinctiveness for every cell of that character with respect to every member of the ambiguous set. It organizes the cell distinctiveness measures into twelve matrices M₀ -M₁₁, one matrix for each of the twelve cells in a character.

A representative matrix M_(K), K=0,1, . . . 11, is shown in FIG. 10. Each element of M_(K), called M_(K) (i,j), is a measure of cell distinctiveness for cell K in character i with respect to character j. Each M_(K) is of size N×N, having one column and one row for every member of the ambiguous set. Since cell distinctiveness of character i with respect to character j is the same as that of character j with respect to character i (i.e., M_(K) (i,j)=M_(K) (j,i)) and a character compared to itself has zero distinctiveness, each matrix M_(K) is symmetric with zeroes on its diagonal.

In this embodiment, the measure of cell distinctiveness is the number of non-identical elements within corresponding 4-element cells of two characters. In other words, M_(K) (i,j) is the summation of an exclusive-OR comparison of each of the four bits in corresponding cells K of two characters i, j. Using this measure, the value 0 indicates that the corresponding cells are identical in i and j, and a value of 4 indicates that corresponding cells K are totally dissimilar, having the greatest amount of relative cell distinctiveness.
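
The measure just described may be sketched, for illustration only, in Python. The sketch assumes that each character has already been grouped into a sequence of cells of four bits each; how the picture elements of the fine array are assigned to cells is not restated here, so that grouping, and the function names, are assumptions of the example.

```python
from typing import List, Sequence

Cell = Sequence[int]        # four picture elements, each 0 or 1
Character = Sequence[Cell]  # one cell per coarse array position


def cell_distinctiveness(cell_a: Cell, cell_b: Cell) -> int:
    """Number of non-identical elements: the sum of bitwise exclusive-ORs."""
    return sum(a ^ b for a, b in zip(cell_a, cell_b))


def learning_matrices(chars: Sequence[Character]) -> List[List[List[int]]]:
    """Return one N x N matrix per cell, indexed as M[K][i][j]."""
    n = len(chars)
    n_cells = len(chars[0])
    return [
        [
            [cell_distinctiveness(chars[i][k], chars[j][k]) for j in range(n)]
            for i in range(n)
        ]
        for k in range(n_cells)
    ]
```

Each matrix produced this way is symmetric with a zero diagonal, because the exclusive-OR is symmetric and any cell compared with itself yields zero.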

As an illustration, FIG. 13 shows an ambiguous set consisting of D, C, and O, each of which is represented by the common coarse array CP(0-11)=(111101111000), shown in FIG. 13.e. The twelve learning matrices for D, C, and O are shown in FIG. 13.f. As can be easily seen, each element of the matrices M₉ -M₁₁, for character cells below the descender line, is zero. Further, all of the elements of M₁, M₂, M₃, M₄, M₇, and M₈ are zero because corresponding cells 1, 2, 3, 4, 7, and 8 in D, C, and O are identical.

To generate M₀, character learner 70 examines cell 0 in each of the reference characters D, C, and O. Since cell 0 in C is identical to cell 0 in O, M₀ (C,O)=M₀ (O,C)=0. Further, M₀ (D,C)=(1+0)+(1+1)+(1+0)+(0+1)=2=M₀ (C,D), and M₀ (D,O)=M₀ (O,D)=2.

Returning to FIG. 8, after all twelve learning matrices are generated, character learner 70 isolates the maximum cell distinctiveness value maxM_(K) (i) for every character i in matrix M_(K). As seen in FIG. 10, there are N maximum cell distinctiveness values for every cell K. As shown in FIG. 11, character learner 70 organizes all of the maximum cell distinctiveness values into a maximum distinctiveness matrix of size N×12 for use with the 12 learning matrices in isolating, for each character in the ambiguous set, the two most distinctive cells, hereinafter called K₁ and K₂, which will be used by character recognition system 10 to resolve ambiguities between that character and the other members of its ambiguous set.
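
For illustration only, the maximum distinctiveness matrix may be sketched in Python as a row-wise maximum over each learning matrix. The nested-list layout `matrices[K][i][j]` and the function name are assumptions of the example.

```python
from typing import List

Matrix = List[List[int]]


def max_distinctiveness_matrix(matrices: List[Matrix]) -> List[List[int]]:
    """Return an N x (number of cells) matrix.

    Entry [i][K] is maxM_K(i): the largest value in row i of learning
    matrix M_K, i.e. the greatest distinctiveness of cell K of character i
    with respect to any member of the ambiguous set.
    """
    n = len(matrices[0])
    return [[max(m[i]) for m in matrices] for i in range(n)]
```

With twelve learning matrices the result has the N×12 size stated above; a row containing fewer than two non-zero values identifies a character for which no pair of distinguishing cells can exist.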

It should be noted that K₁ and K₂ are not selected solely on the basis of their maximum cell distinctiveness values. In some ambiguous sets, the two cells of a pattern having the numerically greatest maximum cell distinctiveness values cannot be used to resolve ambiguities within the set, because at those cell locations, the structure of the pattern is identical to the structure of another pattern in its ambiguous set.

An example of this is seen in FIG. 13, in which cell 0 and cell 6 of character C have the numerically greatest maximum cell distinctiveness values (i.e., 2) because of the structural differences between the characters C and D, but cannot be used to resolve differences between C and O because they are identical to cell 0 and cell 6 of character O.

Keeping all of the above in mind, we turn to the manner in which character learner 70 makes the K₁ and K₂ selection. Specifically, K₁ and K₂ are selected as the most distinctive cells of a reference character P based on a large maximum distinctiveness value and the requirement that cells K₁ and K₂ of character P cannot both be identical to the corresponding cells K₁ and K₂ of any other pattern in its ambiguous set.

As seen in FIG. 9, character learner 70 first generates the maximum distinctiveness matrix row for the character identified by P. Analyzing that row and the 12 learning matrices, character learner 70 selects K₁ and K₂, the address and contents of which are loaded into D2 entry register 88.

The first step in selecting K₁ and K₂ is to determine whether there are at least two values of maxM_(K) (P) which are non-zero; if not, K₁ and K₂ selection for character P is impossible, and ambiguities between P and the other members of the ambiguous set cannot be resolved. Controls 30 cause D2 entry register 88 to be filled with all ones so that, when the D2 entry for P is accessed during use of pattern recognition system 10, values of fifteen in Sections I and III of D2 character register 44 cause disambiguation impossibility recognizer 90 to be activated.

If there are at least two non-zero maxM_(K) (P) values, character learner 70 selects, as an initial K₁, the first cell in the maximum distinctiveness matrix row that has the largest value of maxM_(K) (P). The cell selected as initial K₁ is then excluded from further cell selection for two reasons: to avoid selecting one cell as both K₁ and K₂, and to keep track of which cell combinations have been tested for K₁, K₂ ambiguity-resolving suitability.

Character learner 70 then selects, as an initial K₂, the first of the remaining eligible cells in the matrix row having the largest value of maxM_(K) (P). The K₁, K₂ cell combination is then tested for ambiguity-resolving suitability. Character learner 70 compares learning matrices M_(K1), M_(K2). If any other character i in the ambiguous set is identical to P at cells K₁ and K₂, then M_(K1) (P,i)=M_(K2) (P,i)=0, and character learner 70 rejects K₂. It excludes the current K₂ from the set of eligible cells, checks to see if maxM_(K) (P) is non-zero for at least one eligible cell, selects a second K₂, and tests the new K₁, K₂ cell combination for suitability in resolving ambiguities between P and the other members of its ambiguous set.

If there are no eligible cells for which maxM_(K) (P) is non-zero, then no more cells are presently available for selection as K₂. This does not mean that all cell combinations are unsuitable for ambiguity resolving. It only means that no suitable cell combination includes the cell initially selected as K₁; it is likely that a combination of two other cells, former K₂ selections, is suitable for resolving ambiguities.

Character learner 70 causes the cells earlier discarded as K₂ selections to be reincluded in the eligible cell set. To prevent retesting of cell combinations, earlier K₁ cell selections continue to be excluded from the eligible cell set. Character learner 70 selects a new K₁ and tests it with the eligible K₂ cells for ambiguity-resolving suitability. It continues this cycle until a suitable K₁ and K₂ are selected or until it has tested every combination of cells having non-zero maxM_(K) (P) values.

If a suitable combination of K₁, K₂ cells is not isolated, character learner 70 causes D2 entry register 88 to be filled with all ones so that the first and second fields of the D2 entry indicate that disambiguation involving the character P is impossible.

If cells K₁ and K₂ are isolated, character learner 70 causes D2 entry register 88 to be filled with the address and picture element values of K₁, K₂.
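
The selection cycle described above may be sketched, for illustration only, in Python. The sketch assumes that the learning matrices are held as nested lists `matrices[K][i][j]`, and that a new K₁ is always the eligible cell with the largest remaining maximum value; the function name and the example data are hypothetical. In the example, the entries of M₀ and M₆ and the value M₅ (C,O)=1 follow from the D, C, O illustration, while the values M₅ (D,C)=0 and M₅ (D,O)=1 are assumed purely for the example.

```python
from typing import List, Optional, Tuple

Matrix = List[List[int]]


def select_cells(p: int, matrices: List[Matrix]) -> Optional[Tuple[int, int]]:
    """Return (K1, K2) for character p, or None if no pair is suitable."""
    others = [i for i in range(len(matrices[0])) if i != p]
    # Row of the maximum distinctiveness matrix for character p.
    row = [max(m[p]) for m in matrices]
    # Only cells with a non-zero maximum are candidates; largest value
    # first, and the lowest cell number first among equal values.
    candidates = sorted(
        (k for k, v in enumerate(row) if v > 0),
        key=lambda k: (-row[k], k),
    )
    if len(candidates) < 2:
        return None  # fewer than two non-zero values: selection impossible
    for pos, k1 in enumerate(candidates):
        # Earlier K1 choices stay excluded; later cells are eligible as K2.
        for k2 in candidates[pos + 1:]:
            # Reject the pair if some other character matches p at both cells.
            if all(matrices[k1][p][i] or matrices[k2][p][i] for i in others):
                return k1, k2
    return None


# Example data for the ambiguous set D, C, O (indices 0, 1, 2).
example = [[[0] * 3 for _ in range(3)] for _ in range(12)]
example[0] = [[0, 2, 2], [2, 0, 0], [2, 0, 0]]
example[6] = [[0, 2, 2], [2, 0, 0], [2, 0, 0]]
example[5] = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]  # D row and column are assumed
```

With this data, `select_cells(0, example)` returns `(0, 6)` for D, while `select_cells(1, example)` and `select_cells(2, example)` both return `(0, 5)` for C and O. A `None` result corresponds to filling the D2 entry register with all ones.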

An example of K₁, K₂ selection is illustrated in FIG. 13. Character learner 70 generates the D row of the maximum distinctiveness matrix of FIG. 13.e, and selects cell 0 and cell 6, having distinctiveness values of 2, as initial K₁ and K₂. A review of matrices M₀ and M₆ shows that M₀ (D,C), M₆ (D,C), M₀ (D,O) and M₆ (D,O) are all non-zero, so that cells 0 and 6 are a suitable combination for use in resolving ambiguities between D and the other characters in the ambiguous set.

Character learner 70 then generates the maximum distinctiveness matrix row for the character C. It also selects cells 0 and 6 as initial K₁ and K₂, but a review of M₆ and M₀ shows that M₆ (C,O)=M₀ (C,O)=0, indicating that the structure of characters C and O at cells 0 and 6 is identical. Character learner 70 excludes cell 6 from cell selection and chooses cell 5, having a distinctiveness value of 1, as K₂. Since C is not identical to either D or O at cells 0 and 5, there exist no M₀ (C,i) and M₅ (C,i) which both equal 0; cells 0 and 5 are a suitable cell combination for resolving ambiguities involving C.

Similarly, for character O, the initial selection of cells 0 and 6 proves unsuitable because of the identical structure of characters C and O at cells 0 and 6. Character learner 70 selects the combination of cells 0 and 5 to resolve ambiguities between O and the other members C and D of its ambiguous set. The D2 entries for D, C and O, generated by character learner 70, are shown in FIG. 13.f.

Returning to FIG. 8, controls 30 cause D2 entry register 88 to be loaded into D2BR, and D2BR to be read into D2RAM at the location addressed by D2AR.

The second word of the D2 entry is then generated. D2AR is incremented to provide access to the next memory word in D2RAM, and character learner 70 causes P to be loaded into D2BR[8-15]. (In this way, even if ambiguities between P and another member of its ambiguous set cannot be resolved, a somewhat similar pattern, one represented by the same coarse array, is available to be loaded into character name inputter 20.) D2BR is read into D2RAM, and ambiguity counter 64 is decremented so that N includes one less unanalyzed reference character in the ambiguous set.
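
For illustration only, the two words of a D2 entry may be sketched in Python. The sketch assumes 16-bit words, four 4-bit fields in the first word ordered as K₁ address, K₁ element values, K₂ address, K₂ element values, and zero-filled reserved fields in the second word; the field order, the field widths, and the function names are assumptions of the example and are not confirmed by the description.

```python
from typing import Optional, Tuple

CellField = Tuple[int, int]  # (cell address, 4-bit picture element values)

ALL_ONES = 0xFFFF  # first word when no suitable K1, K2 pair exists


def pack_d2_entry(
    code: int, cells: Optional[Tuple[CellField, CellField]] = None
) -> Tuple[int, int]:
    """Return (first word, second word) of a D2 entry."""
    if not 0 <= code <= 0xFF:
        raise ValueError("character code must fit in eight bits")
    if cells is None:
        first = ALL_ONES
    else:
        (k1, bits1), (k2, bits2) = cells
        for value in (k1, bits1, k2, bits2):
            if not 0 <= value <= 0xF:
                raise ValueError("each field must fit in four bits")
        first = (k1 << 12) | (bits1 << 8) | (k2 << 4) | bits2
    second = code  # reserved fields left at zero, code in the low eight bits
    return first, second


def disambiguation_impossible(first: int) -> bool:
    """True when both assumed address fields hold the value fifteen."""
    return ((first >> 12) & 0xF) == 0xF and ((first >> 4) & 0xF) == 0xF
```

Because cell addresses run from 0 to 11, an address field of fifteen cannot name a real cell, which is why an all-ones first word can serve as the marker in this sketch.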

Character learner 70 continues to generate D2 entries for the rest of the members of the ambiguous set. It increments D2AR to access the next memory location in D2RAM and loads code register 76 with P, the next character in the ambiguous set. When N becomes zero, character learner 70, which has generated D2 entries for all of the members of the ambiguous set defined by CP(0-11) in D1AR, accesses the next D1 entry in D1RAM in order to isolate the next ambiguous set.

Returning to FIG. 7, character learner 70 increments D2AR to access the next available unfilled memory word of D2RAM and decrements D1 entry counter 74 to include one fewer unanalyzed D1 entry; if D1 entry counter 74 is non-zero, character learner 70 accesses the next addressable location in D1RAM.

If D1 entry counter 74 is zero, character learner 70 has generated D1 and, if necessary, D2 entries for every reference character stored in the list storage 60. Controls 30 cause character learner 70 to exit pattern learning system 50.

Many variations of the apparatus within the scope of the invention will be apparent to a person skilled in the art.

For example, the first field of a D1 entry could be increased to accommodate reference sets having ambiguous sets with many members.

The recognizing and learning systems and apparatus could be modified to accommodate longer identifying character codes for reference characters in a larger reference set.

The apparatus and systems could also be modified to accommodate larger or smaller fine arrays. Generally, as the number of rows and columns in an array increases, the number of picture elements in the array increases, and the apparatus and systems would be modified accordingly. Assuming the same data compression scheme is used, the coarse array would also be modified in size.

In some reference sets, for example, in some type fonts, there are no characters which occupy both ascender and descender space in a character window. In other words, there are no characters which occupy cells 0-2 and cells 9-11 at the same time. For a recognition system designed for these reference sets, a reference character may be represented by a 48 picture element fine array and a 9 cell coarse array. Digitizer 12 and compressor 14 would still generate DG(0-47) and CP(0-11), respectively, but reference characters not occupying descender space would have coarse arrays corresponding to CP(0-8), while reference characters occupying descender space would have coarse arrays corresponding to CP(3-11). As before, the pattern learning system would isolate ambiguous sets of characters represented by identical coarse arrays, but, in this embodiment, where the coarse array would have 9 elements, ambiguous sets could contain seemingly dissimilar characters which are easily disambiguated during fine analysis of the 48 element fine array representations DG(0-47).

In reference sets of this sort, the fine array representations could also be modified to contain only 36 elements. Reference characters not occupying descender space would have fine arrays corresponding to DG(0-35), and reference characters occupying descender space would have fine arrays corresponding to DG(12-47).
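
The reduced arrays of this variation may be sketched, for illustration only, in Python. The sketch assumes that a character is treated as occupying descender space whenever any of coarse cells 9-11 is non-zero; that test and the function names are assumptions of the example.

```python
from typing import Sequence, Tuple


def occupies_descender(cp: Sequence[int]) -> bool:
    """Assumed test: any non-zero element among coarse cells 9-11."""
    return any(cp[9:12])


def nine_cell_coarse(cp: Sequence[int]) -> Tuple[int, ...]:
    """Return CP(3-11) for descender characters, otherwise CP(0-8)."""
    if len(cp) != 12:
        raise ValueError("expected a 12 cell coarse array")
    return tuple(cp[3:12]) if occupies_descender(cp) else tuple(cp[0:9])


def thirty_six_element_fine(dg: Sequence[int], cp: Sequence[int]) -> Tuple[int, ...]:
    """Return DG(12-47) for descender characters, otherwise DG(0-35)."""
    if len(dg) != 48 or len(cp) != 12:
        raise ValueError("expected DG(0-47) and CP(0-11)")
    return tuple(dg[12:48]) if occupies_descender(cp) else tuple(dg[0:36])
```

For the coarse array (111101111000) used in the D, C, O illustration, cells 9-11 are zero, so the sketch keeps CP(0-8) and DG(0-35).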

The modifications necessary to the apparatus and systems of the preferred embodiment to accommodate the above suggested array modifications will be obvious upon a review of the figures and the above description of the described embodiment.

What is claimed is:
 1. Apparatus for generating a machine readable signal indicating which one of a set of reference characters corresponds to a visible pattern, including means for storing signals indicative of a set of reference fine n-dimensional array values, each reference fine array value corresponding to one of the reference characters, means for storing signals indicative of a set of reference coarse m-dimensional array values, where m is less than n, each reference coarse array value being derived either from a signal one only of said reference fine array values (in which case it will be designated an unambiguous value), or from any one of a plurality of said reference fine array values (in which case it will be designated an ambiguous value), means for measuring with an optical instrument a workpiece visible pattern in each of n fields to provide a signal indicative of a workpiece fine n-dimensional array value and means for storing said last named signal, means for forming a signal indicative of a workpiece coarse m-dimensional array value from said workpiece fine array value signal, first comparing means for comparing the signal of the workpiece coarse array value with the signals of the set of reference coarse array values to produce a signal indicating the identification of the workpiece pattern with either an unambiguous reference coarse array value or an ambiguous reference coarse array value, means responsive to a signal from said first comparing means indicative of identification with an unambiguous array value to generate a signal indicative of the reference character associated with the identified unambiguous array value, second comparing means responsive to a signal from said first comparing means indicative of identification with an ambiguous array value to compare signals indicative of values of a preselected subset of elements from the plurality of reference fine array values associated with the identified ambiguous coarse array value with the signals indicative of values of corresponding elements of the workpiece fine array, to generate a signal indicative of identification of the workpiece pattern with a particular one of the reference characters associated with the identified ambiguous value.
 2. Apparatus as claimed in claim 1, including means for augmenting the set of reference characters with an unidentified workpiece pattern. .Iadd.
 3. Apparatus for generating a machine readable signal indicating which one of a set of reference characters corresponds to a pattern, including means for storing signals indicative of a set of reference fine n-dimensional array values, each reference fine array value corresponding to one of the reference characters, means for storing signals indicative of a set of reference coarse m-dimensional array values, where m is less than n, each reference coarse array value being derived either from a signal one only of said reference fine array values (in which case it will be designated an unambiguous value), or from any one of a plurality of said reference fine array values (in which case it will be designated an ambiguous value), means for measuring a workpiece pattern in each of n fields to provide a signal indicative of a workpiece fine n-dimensional array value and means for storing said last named signal, means for forming a signal indicative of a workpiece coarse m-dimensional array value from said workpiece fine array value signal, first comparing means for comparing the signal of the workpiece coarse array value with the signals of the set of reference coarse array values to produce a signal indicating the identification of the workpiece pattern with either an unambiguous reference coarse array value or an ambiguous reference coarse array value, means responsive to a signal from said first comparing means indicative of identification with an unambiguous array value to generate a signal indicative of the reference character associated with the identified unambiguous array value, second comparing means responsive to a signal from said first comparing means indicative of identification with an ambiguous array value to compare signals indicative of values of a preselected subset of elements from the plurality of reference fine array values associated with the identified ambiguous coarse array value with the signals indicative of values of corresponding elements of the workpiece fine array, to generate a signal indicative of identification of the workpiece pattern with a particular one of the reference characters associated with the identified ambiguous value. .Iaddend. .Iadd.
 4. Apparatus as claimed in claim 3, including means for augmenting the set of reference characters with an unidentified workpiece pattern. .Iaddend. .Iadd.
 5. Apparatus for generating a machine readable signal indicating which one of a set of reference characters corresponds to a pattern, including means for storing signals indicative of a set of reference fine n-dimensional array values, each reference fine array value corresponding to one of the reference characters, means for storing signals indicative of a set of reference coarse m-dimensional array values, where m is less than n, each reference coarse array value being derived either from a signal one only of said reference fine array values (in which case it will be designated an unambiguous value), or from any one of a plurality of said reference fine array values (in which case it will be designated an ambiguous value), means for providing a signal indicative of a workpiece fine n-dimensional array value and means for storing said last named signal, means for forming a signal indicative of a workpiece coarse m-dimensional array value from said workpiece fine array value signal, first comparing means for comparing the signal of the workpiece coarse array value with the signals of the set of reference coarse array values to produce a signal indicating the identification of the workpiece pattern with either an unambiguous reference coarse array value or an ambiguous reference coarse array value, means responsive to a signal from said first comparing means indicative of identification with an unambiguous array value to generate a signal indicative of the reference character associated with the identified unambiguous array value, second comparing means responsive to a signal from said first comparing means indicative of identification with an ambiguous array value to compare signals indicative of values of a preselected subset of elements from the plurality of reference fine array values associated with the identified ambiguous coarse array value with the signals indicative of values of corresponding elements of the workpiece fine array, to generate a signal indicative of identification of the workpiece pattern with a particular one of the reference characters associated with the identified ambiguous value. .Iaddend. .Iadd.
 6. Apparatus as claimed in claim 5, including means for augmenting the set of reference characters with an unidentified workpiece pattern. .Iaddend.